Add SQL parser prototype #338

Draft: wants to merge 6 commits into base: main

@linhr (Contributor) commented Dec 26, 2024

No description provided.

Spark Test Report

Commit Information

Commit  Revision  Branch
After   de87520   refs/pull/338/merge
Before  a574315   refs/heads/main

Test Summary

Suite Commit Failed Passed Skipped Warnings Time (s)
doctest-column After 4 29 3 6.08
Before 4 29 3 5.58
doctest-dataframe After 34 72 1 4 8.10
Before 34 72 1 4 7.44
doctest-functions After 179 224 6 8 11.71
Before 179 224 6 8 11.55
test-connect After 272 757 135 262 125.46
Before 272 757 135 262 125.68

Test Details

Error Counts
          489 Total
          273 Total Unique
-------- ---- ----------------------------------------------------------------------------------------------------------
           29 DocTestFailure
           15 UnsupportedOperationException: streaming query manager command
           13 AssertionError: AnalysisException not raised
           13 UnsupportedOperationException: lambda function
           10 UnsupportedOperationException: unsupported data source format: Some("text")
           10 handle add artifacts
            8 PySparkAssertionError: [DIFFERENT_PANDAS_DATAFRAME] DataFrames are not almost equal:
            8 UnsupportedOperationException: hint
            7 AssertionError: False is not true
            6 UnsupportedOperationException: function: window
            6 UnsupportedOperationException: write stream operation start
            5 AnalysisException: Cannot cast to Decimal128(14, 7). Overflowing on NaN
            5 UnsupportedOperationException: function: monotonically_increasing_id
            4 AnalysisException: Schema contains duplicate qualified field name range."#0"
            4 AnalysisException: Schema contains duplicate qualified field name t."#2"
            4 AssertionError: "TABLE_OR_VIEW_NOT_FOUND" does not match "Error during planning: No table named 'v'"
            4 PySparkNotImplementedError: [NOT_IMPLEMENTED] rdd() is not implemented.
            4 PythonException:  KeyError: 0 KeyError: 0
            4 UnsupportedOperationException: TryFrom spec::DataType::Interval(DayTime) to Spark Kind
            4 UnsupportedOperationException: sample
            4 UnsupportedOperationException: sample by
            4 UnsupportedOperationException: unknown aggregate function: hll_sketch_agg
            4 UnsupportedOperationException: unpivot
            3 AnalysisException: Error during planning: Error during planning: spark_array does not support zero a...
            3 AnalysisException: Error during planning: The expression to get an indexed field is only valid for `...
            3 AnalysisException: Execution error: 'Utf8("INTERVAL '0 00:00:00.000123' DAY TO SECOND") = CAST(#1 AS...
            3 AssertionError: AnalysisException not raised by <lambda>
            3 AssertionError: Lists differ: [Row(a=1, a=1, b='x')] != [Row(a=1, b='x')]
            3 UnsupportedOperationException: function: input_file_name
            3 UnsupportedOperationException: function: pmod
            3 UnsupportedOperationException: function: ~
            3 UnsupportedOperationException: handle analyze input files
            3 ValueError: Converting to Python dictionary is not supported when duplicate field names are present
            2 AnalysisException: Error during planning: The expression to get an indexed field is only valid for `...
            2 AnalysisException: Error during planning: cannot resolve attribute: ObjectName([Identifier("a"), Ide...
            2 AnalysisException: Error during planning: two values expected: [Column(Column { relation: Some(Bare ...
            2 AnalysisException: Invalid or Unsupported Configuration: Could not find config namespace "spark"
            2 AssertionError
            2 AssertionError: "Exception thrown when converting pandas.Series" does not match "
            2 AssertionError: '[1 2 3]' != '[1, 2, 3]'
            2 AssertionError: '[array([1, 2], dtype=int32) array([3, 4], dtype=int32)]' != '[[1, 2], [3, 4]]'
            2 AssertionError: Lists differ: [Row([22 chars](key=1, value='1'), Row(key=10, value='10'), R[2402 cha...
            2 AssertionError: Lists differ: [Row(a=1, a=1, b=2)] != [Row(a=1, b=2)]
            2 IllegalArgumentException: expected value at line 1 column 1
            2 IllegalArgumentException: invalid argument: empty data source paths
            2 PythonException:  TypeError: unsupported type for timedelta microseconds component: datetime.timedel...
            2 SparkRuntimeException: Internal error: start_from index out of bounds.
            2 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
            2 UnsupportedOperationException: PlanNode::IsCached
            2 UnsupportedOperationException: TryFrom spec::DataType::Timestamp { time_unit: Second | Millisecond |...
            2 UnsupportedOperationException: approx quantile
            2 UnsupportedOperationException: collect metrics
            2 UnsupportedOperationException: decimal literal with precision or scale
            2 UnsupportedOperationException: freq items
            2 UnsupportedOperationException: function: bitmap_bit_position
            2 UnsupportedOperationException: function: crc32
            2 UnsupportedOperationException: function: dayofweek
            2 UnsupportedOperationException: function: encode
            2 UnsupportedOperationException: function: format_number
            2 UnsupportedOperationException: function: from_csv
            2 UnsupportedOperationException: function: from_json
            2 UnsupportedOperationException: function: inline
            2 UnsupportedOperationException: function: map_from_arrays
            2 UnsupportedOperationException: function: sec
            2 UnsupportedOperationException: function: shiftrightunsigned
            2 UnsupportedOperationException: function: timestamp_seconds
            2 UnsupportedOperationException: handle analyze same semantics
            2 UnsupportedOperationException: pivot
            2 UnsupportedOperationException: position with 3 arguments is not supported yet
            2 UnsupportedOperationException: rebalance partitioning by expression
            2 UnsupportedOperationException: tail
            2 UnsupportedOperationException: unknown aggregate function: collect_set
            2 UnsupportedOperationException: unresolved regex
            2 UnsupportedOperationException: unsupported data source format: Some("orc")
            2 UnsupportedOperationException: user defined data type should only exist in a field
            2 handle artifact statuses
            2 received metadata size exceeds hard limit (19620 vs. 16384);  :status:42B content-type:60B grpc-stat...
            1 AnalysisException: Cannot cast string 'abc' to value of Float64 type
            1 AnalysisException: Cannot cast value 'abc' to value of Boolean type
            1 AnalysisException: Error during planning: Error during planning: Coercion from [Utf8, Int32, Boolean...
            1 AnalysisException: Error during planning: Execution error: User-defined coercion failed with Interna...
            1 AnalysisException: Error during planning: Failed to parse placeholder id: cannot parse integer from ...
            1 AnalysisException: Error during planning: Inconsistent data type across values list at row 1 column ...
            1 AnalysisException: Error during planning: UNION queries have different number of columns: left has 3...
            1 AnalysisException: Error during planning: cannot resolve attribute: ObjectName([Identifier("a"), Ide...
            1 AnalysisException: Error during planning: three values expected: [Column(Column { relation: Some(Bar...
            1 AnalysisException: Error during planning: three values expected: [Literal(Int32(1)), Literal(Int32(3...
            1 AnalysisException: Error during planning: two values expected: [Column(Column { relation: Some(Bare ...
            1 AnalysisException: Execution error: 'Utf8("1970-01-01 00:00:00") = CAST(#1 AS Utf8)' is not true!
            1 AnalysisException: Execution error: 'Utf8("2012-02-02 02:02:02") = CAST(?table?.#0 AS Utf8)' is not ...
            1 AnalysisException: Execution error: Error parsing timestamp from '2023-01-01' using format '%d-%m-%Y...
            1 AnalysisException: Execution error: Unable to find factory for TEXT
            1 AnalysisException: Execution error: map requires all value types to be the same
(+1)        1 AnalysisException: Invalid or Unsupported Configuration: could not find config namespace for key "ig...
            1 AnalysisException: Invalid or Unsupported Configuration: could not find config namespace for key "li...
            1 AnalysisException: Schema contains duplicate qualified field name "?table?"."#0"
            1 AnalysisException: Schema contains duplicate unqualified field name "#0"
            1 AssertionError: "2000000" does not match "Internal error: raise_error expects a single UTF-8 string ...
(+1)        1 AssertionError: "Database 'memory:82a39f19-6ace-4647-9693-4271b2f0093d' dropped." does not match "in...
(+1)        1 AssertionError: "Database 'memory:9dd9473d-18be-43b1-a87a-9268b4222d14' dropped." does not match "in...
            1 AssertionError: "foobar" does not match "Internal error: raise_error expects a single UTF-8 string a...
            1 AssertionError: "timestamp values are not equal (timestamp='1969-01-01 09:01:01+08:00': data[0][1]='...
            1 AssertionError: '+---[17 chars]-----+\n|                        x|\n+--------[132 chars]-+\n' != '+-...
            1 AssertionError: 0 != '0'
            1 AssertionError: 2 != 3
            1 AssertionError: BinaryType() != NullType()
            1 AssertionError: Exception not raised
            1 AssertionError: Lists differ: [Row([14 chars] _c1=25, _c2='I am Hyukjin\n\nI love Spark!'),[86 chars...
            1 AssertionError: Lists differ: [Row([22 chars]e(2018, 12, 31, 16, 0), aware=datetime.datetim[16 chars...
            1 AssertionError: Lists differ: [Row([259 chars]681098, ln(id)=1.0986122886681098, struct(id, [975 cha...
            1 AssertionError: Lists differ: [Row([49 chars] 1), internal_value=-31532339000000000), Row(i[225 char...
            1 AssertionError: Lists differ: [Row(key='0'), Row(key='1'), Row(key='10'), Ro[1439 chars]99')] != [Ro...
            1 AssertionError: Lists differ: [Row(name='Andy', age=30), Row(name='Justin', [34 chars]one)] != [Row(...
            1 AssertionError: Row(point=ExamplePoint([,1), pypoint=ExamplePoint([,3)) != Row(point='(1.0, 2.0)', p...
            1 AssertionError: Row(res="[('personal', [('name', 'John'), ('city', 'New York')])]") != Row(res="{'pe...
            1 AssertionError: StorageLevel(False, True, True, False, 1) != StorageLevel(False, False, False, False...
            1 AssertionError: Struc[30 chars]estampType(), True), StructField('val', IntegerType(), True)]) != Str...
            1 AssertionError: Struc[32 chars]e(), False), StructField('b', DoubleType(), Fa[158 chars]ue)]) != Str...
            1 AssertionError: Struc[40 chars]ue), StructField('val', ArrayType(DoubleType(), False), True)]) != St...
            1 AssertionError: Struc[64 chars]Type(), True), StructField('i', StringType(), True)]), False)]) != St...
            1 AssertionError: Struc[69 chars]e(), True), StructField('name', StringType(), True)]), True)]) != Str...
            1 AssertionError: YearMonthIntervalType(0, 1) != YearMonthIntervalType(0, 0)
            1 AssertionError: [1.0, 2.0] != ExamplePoint(1.0,2.0)
            1 AssertionError: {} != {'max_age': 5}
            1 AttributeError: 'DataFrame' object has no attribute '_ipython_key_completions_'
            1 AttributeError: 'DataFrame' object has no attribute '_joinAsOf'
(+1)        1 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpn9q1th_z'
(+1)        1 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpp7jcxu6l'
            1 IllegalArgumentException: 83140 is too large to store in a Decimal128 of precision 4. Max is 9999
            1 IllegalArgumentException: invalid argument: from_unixtime with format is not supported yet
            1 IllegalArgumentException: invalid argument: sql parser error: Expected: (, found: AS at Line: 1, Col...
            1 IllegalArgumentException: invalid argument: sql parser error: Expected: (, found: EOF
            1 KeyError: 'max'
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] foreach() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] foreachPartition() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] localCheckpoint() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] sparkContext() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] toJSON() is not implemented.
            1 PythonException: 
            1 PythonException:  AttributeError: 'NoneType' object has no attribute 'partitionId'
            1 PythonException:  AttributeError: 'list' object has no attribute 'x'
            1 PythonException:  AttributeError: 'list' object has no attribute 'y'
            1 PythonException:  ValueError: 0 is not in range KeyError: 0
            1 QueryExecutionException: Json error: Not valid JSON: EOF while parsing a list at line 1 column 1
            1 QueryExecutionException: Json error: Not valid JSON: expected value at line 1 column 2
            1 SparkRuntimeException: External error: Arrow error: Invalid argument error: column types must match ...
            1 SparkRuntimeException: Optimizer rule 'optimize_projections' failed
            1 SparkRuntimeException: type_coercion
            1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
            1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
            1 UnsupportedOperationException: COUNT DISTINCT with multiple arguments
            1 UnsupportedOperationException: Insert into not implemented for this table
            1 UnsupportedOperationException: SQL show databases
            1 UnsupportedOperationException: SQL show functions
            1 UnsupportedOperationException: bucketing
            1 UnsupportedOperationException: call function
            1 UnsupportedOperationException: deduplicate within watermark
            1 UnsupportedOperationException: function exists
            1 UnsupportedOperationException: function: array_insert
            1 UnsupportedOperationException: function: array_sort
            1 UnsupportedOperationException: function: arrays_zip
            1 UnsupportedOperationException: function: bin
            1 UnsupportedOperationException: function: bit_count
            1 UnsupportedOperationException: function: bit_get
            1 UnsupportedOperationException: function: bitmap_bucket_number
            1 UnsupportedOperationException: function: bitmap_count
            1 UnsupportedOperationException: function: bround
            1 UnsupportedOperationException: function: conv
            1 UnsupportedOperationException: function: convert_timezone
            1 UnsupportedOperationException: function: csc
            1 UnsupportedOperationException: function: date_from_unix_date
            1 UnsupportedOperationException: function: dayofmonth
            1 UnsupportedOperationException: function: dayofyear
            1 UnsupportedOperationException: function: decode
            1 UnsupportedOperationException: function: e
            1 UnsupportedOperationException: function: elt
            1 UnsupportedOperationException: function: format_string
            1 UnsupportedOperationException: function: from_utc_timestamp
            1 UnsupportedOperationException: function: getbit
            1 UnsupportedOperationException: function: hour
            1 UnsupportedOperationException: function: inline_outer
            1 UnsupportedOperationException: function: java_method
            1 UnsupportedOperationException: function: json_object_keys
            1 UnsupportedOperationException: function: json_tuple
            1 UnsupportedOperationException: function: last_day
            1 UnsupportedOperationException: function: localtimestamp
            1 UnsupportedOperationException: function: make_dt_interval
            1 UnsupportedOperationException: function: make_interval
            1 UnsupportedOperationException: function: make_timestamp
            1 UnsupportedOperationException: function: make_timestamp_ltz
            1 UnsupportedOperationException: function: make_timestamp_ntz
            1 UnsupportedOperationException: function: make_ym_interval
            1 UnsupportedOperationException: function: map_concat
            1 UnsupportedOperationException: function: map_from_entries
            1 UnsupportedOperationException: function: mask
            1 UnsupportedOperationException: function: minute
            1 UnsupportedOperationException: function: months_between
            1 UnsupportedOperationException: function: next_day
            1 UnsupportedOperationException: function: parse_url
            1 UnsupportedOperationException: function: printf
            1 UnsupportedOperationException: function: quarter
            1 UnsupportedOperationException: function: reflect
            1 UnsupportedOperationException: function: regexp_count
            1 UnsupportedOperationException: function: regexp_extract
            1 UnsupportedOperationException: function: regexp_extract_all
            1 UnsupportedOperationException: function: regexp_instr
            1 UnsupportedOperationException: function: regexp_substr
            1 UnsupportedOperationException: function: schema_of_csv
            1 UnsupportedOperationException: function: schema_of_json
            1 UnsupportedOperationException: function: second
            1 UnsupportedOperationException: function: sentences
            1 UnsupportedOperationException: function: session_window
            1 UnsupportedOperationException: function: sha
            1 UnsupportedOperationException: function: sha1
            1 UnsupportedOperationException: function: soundex
            1 UnsupportedOperationException: function: spark_partition_id
            1 UnsupportedOperationException: function: split
            1 UnsupportedOperationException: function: stack
            1 UnsupportedOperationException: function: str_to_map
            1 UnsupportedOperationException: function: timestamp_micros
            1 UnsupportedOperationException: function: timestamp_millis
            1 UnsupportedOperationException: function: to_char
            1 UnsupportedOperationException: function: to_csv
            1 UnsupportedOperationException: function: to_json
            1 UnsupportedOperationException: function: to_number
            1 UnsupportedOperationException: function: to_unix_timestamp
            1 UnsupportedOperationException: function: to_utc_timestamp
            1 UnsupportedOperationException: function: to_varchar
            1 UnsupportedOperationException: function: try_add
            1 UnsupportedOperationException: function: try_divide
            1 UnsupportedOperationException: function: try_element_at
            1 UnsupportedOperationException: function: try_multiply
            1 UnsupportedOperationException: function: try_subtract
            1 UnsupportedOperationException: function: try_to_binary
            1 UnsupportedOperationException: function: try_to_number
            1 UnsupportedOperationException: function: try_to_timestamp
            1 UnsupportedOperationException: function: typeof
            1 UnsupportedOperationException: function: unix_date
            1 UnsupportedOperationException: function: unix_micros
            1 UnsupportedOperationException: function: unix_millis
            1 UnsupportedOperationException: function: unix_seconds
            1 UnsupportedOperationException: function: url_decode
            1 UnsupportedOperationException: function: url_encode
            1 UnsupportedOperationException: function: weekday
            1 UnsupportedOperationException: function: width_bucket
            1 UnsupportedOperationException: function: xpath
            1 UnsupportedOperationException: function: xpath_boolean
            1 UnsupportedOperationException: function: xpath_double
            1 UnsupportedOperationException: function: xpath_float
            1 UnsupportedOperationException: function: xpath_int
            1 UnsupportedOperationException: function: xpath_long
            1 UnsupportedOperationException: function: xpath_number
            1 UnsupportedOperationException: function: xpath_short
            1 UnsupportedOperationException: function: xpath_string
            1 UnsupportedOperationException: handle analyze is local
            1 UnsupportedOperationException: handle analyze semantic hash
            1 UnsupportedOperationException: list functions
            1 UnsupportedOperationException: unknown aggregate function: bitmap_or_agg
            1 UnsupportedOperationException: unknown aggregate function: count_if
            1 UnsupportedOperationException: unknown aggregate function: count_min_sketch
            1 UnsupportedOperationException: unknown aggregate function: grouping_id
            1 UnsupportedOperationException: unknown aggregate function: histogram_numeric
            1 UnsupportedOperationException: unknown aggregate function: percentile
            1 UnsupportedOperationException: unknown aggregate function: try_avg
            1 UnsupportedOperationException: unknown aggregate function: try_sum
            1 UnsupportedOperationException: unknown function: distributed_sequence_id
            1 UnsupportedOperationException: unknown function: product
            1 ValueError: Code in Status proto (StatusCode.INTERNAL) doesn't match status code (StatusCode.RESOURC...
            1 ValueError: The column label 'struct' is not unique.
            1 internal error: unknown attribute in plan 2399: ColUmn
            1 internal error: unknown attribute in plan 239: a
            1 internal error: unknown attribute in plan 2486: ColUmn
            1 internal error: unknown attribute in plan 2647: COLUMN
(-1)        0 AnalysisException: Invalid or Unsupported Configuration: could not find config namespace for key "ig...
(-1)        0 AssertionError: "Database 'memory:3a4345be-44e0-44d1-9cbf-15cd6be14b14' dropped." does not match "in...
(-1)        0 AssertionError: "Database 'memory:f2fb3bca-1bb7-49f3-9252-71cdcbf0f89e' dropped." does not match "in...
(-1)        0 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpv2lzo71y'
(-1)        0 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpz2hwefri'
Passed Tests Diff

(empty)
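
The Error Counts listing in this report follows a regular layout: an optional `(+n)` or `(-n)` delta, a count, then the error message. A minimal sketch for tallying those counts per error kind is given below, assuming that layout holds for every line. The names `parse_error_line` and `tally_by_kind` are hypothetical and not part of the report tooling, and "kind" is taken to mean the text before the first colon.

```python
import re
from collections import Counter

# Assumed layout, inferred from the listing: optional "(+n)"/"(-n)" delta,
# then a count, then the error message.
LINE_RE = re.compile(r"^\s*(?:\(([+-]\d+)\))?\s*(\d+)\s+(.*\S)\s*$")


def parse_error_line(line):
    """Return (delta, count, message), or None for separators and other lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    delta = int(m.group(1)) if m.group(1) else 0
    return delta, int(m.group(2)), m.group(3)


def tally_by_kind(lines):
    """Sum counts per error kind (the text before the first colon)."""
    totals = Counter()
    for line in lines:
        parsed = parse_error_line(line)
        if parsed is None:
            continue
        _, count, message = parsed
        # Skip the two summary rows at the top of the listing.
        if message in ("Total", "Total Unique"):
            continue
        kind = message.split(":", 1)[0].strip()
        totals[kind] += count
    return totals


# A few lines copied from the listing above.
sample = [
    "          489 Total",
    "-------- ---- ------",
    "           29 DocTestFailure",
    "           15 UnsupportedOperationException: streaming query manager command",
    "           13 UnsupportedOperationException: lambda function",
    "(+1)        1 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpn9q1th_z'",
]

for kind, count in tally_by_kind(sample).most_common():
    print(f"{count:5d} {kind}")
```

Messages truncated with `...` in the listing are grouped only by their prefix, so the tally is per exception class rather than per distinct message.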
